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Abstract 

This paper revisits the problem of recovering a smooth, isotropic, layered wave speed profile 
from surface traveltime information. While it is classic knowledge that the diving (refracted) rays 
classically determine the wave speed in a weakly well-posed fashion via the Abel transform, we 
show in this paper that traveltimes of reflected rays do not contain enough information to recover 
the medium in a well-posed manner, regardless of the discretization. The counterpart of the 
Abel transform in the case of reflected rays is a Fredholm kernel of the first kind which is shown 
to have singular values that decay at least root-exponentially. Kinematically equivalent media 
are characterized in terms of a sequence of matching moments. This severe conditioning issue 
comes on top of the well-known rearrangement ambiguity due to low velocity zones. Numerical 
experiments in an ideal scenario show that a waveform-based model inversion code fits data 
accurately while converging to the wrong wave speed profile. 
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1 Introduction 

1.1 Problem setup and context 

We consider the ray-theoretic limit of high-frequency waves propagating in a slab < z < h, made 
of a heterogeneous layered medium with smooth isotropic wave speed c{z) . We assume that waves 
can only be sent from, and recorded at the surface z = 0. Without loss of generality the waves 
are assumed to originate from the origin x = z = 0, as all points are equivalent on the surface. 
The transverse coordinate x is assumed to be one-dimensional, as otherwise the problem would be 
radially symmetric about x = 0. We also assume that all other physical parameters that may affect 
wave dynamics, such as density, are constant. 

The information available for the inversion is the traveltime r of the various waves as a function 
of the recording position x. The two types of waves in a layered slab are 








diving, or refracted waves, which arrive back at 2; = from 
overturning before reaching z = h; and 



h 



z 



transmitted waves, which arrive at z = h. The first reflected 
wave, recorded at 2; = after reflecting off of the boundary z = 
h, arrives twice later and twice farther than the transmitted 
wave, hence contains the same information. Multiply reflected 
waves also do not carry any new information. 



Diving waves occur for instance when c{z) is monotonically increasing. We assume for sim- 
plicity that the type of a wave (diving or reflected) is a priori known. Waves that do not reflect 
(such as diving waves) are usually called "transmitted" in the geophysics literature, so the word 
"transmitted" is used very sparingly in the sequel to avoid confusion. 

The inverse problem of recovering c{z) from the traveltime information of diving waves was 
solved circa 1910 by Herglotz [18j, Wiechert and Geiger |38j, and Bateman |4j in what is perhaps 
the first contribution by mathematicians to seismology. Their explicit formula takes the form of an 
inverse Abel transform and has been textbook material for a long time [21 [28l |3l [Ml ESj • It will be 
reviewed in this paper, along with the analysis of its stability. 

What can be said about the corresponding inverse problem for reflected waves? Many authors 
have argued that this problem is quite different from that concerning diving waves. Firstly, there 
may not be an explicit formula to solve the problem. But more importantly, the problem has 
a completely different stability behavior. Qualitative discussion of ill-posedness of traveltime to- 
mography was an active topic in the geophysics community in the late 1980s and early 1990s, see 
for instance Stork and Clayton [Ml [33]; Bube, Langan, and Resnick [HI [3 [8]; Ivansson [19]; and 
Delprat-Jannaud and Lailly |1H I12j . This paper aims to settle in a quantitative manner that, 
regardless of the discretization, there is not enough information in traveltime data from reflected 
rays to solve for a velocity profile c{z) in a well-posed manner. 

It is clear that in the one-dimensional case of a ray traveling from z = to z = h (with 
h known), there is a fatal obstruction to solving the inverse problem. The only datum is the 
traveltime r = l/c{z)dz, hence two smooth profiles with the same slowness integral will be 
indistinguishable. One would think that passing to a multidimensional situation may allow to 
recover a well-posed problem by triangulation — the option to observe a fixed scene from different 
angles — but that is not the case. Perhaps surprisingly, the presence of rays with different take-off 
positions and angles only marginally improves the determination of the velocity profile, at least in 
the layered case. 

Determining a velocity profile form traveltime data is a nonlinear problem. As we will see, both 
in the diving and the refiected case, the forward model can be split into the composition of two 
operations: 

• a nonlinear operation of mapping the velocity profile to its decreasing rearrangement, which 
is invertible when the function is monotone but not otherwise; followed by 

• a linear integral operator acting on the inverse of this rearrangement. It is that integral 
operator which is invertible and relatively well-conditioned (of Volterra type) in the diving 
case, but always ill-conditioned (of Fredholm type) in the reflected case. 
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The possible lack of invertibility of the nonlinear step is well-understood: geophysicists refer to the 
lack of (increasing) monotonicity of the velocity profile in z as the "presence of low- velocity zones" 
Pl [28] . The characterization of the conditioning of the linear step seems to be less well understood 
and is the subject of this paper. 

It should be mentioned that the meager results in this paper are far from shedding adequate 
light on the bigger problem of solvability and well-posedness of the general traveltime tomography 
problem, also called boundary rigidity problem. Much progress was obtained on this question 
recently; see [3Tj and other upcoming publications by Uhlmann et al. For us, settling the layered 
special case serves to explain some disturbing numerical results that were observed in the scope of 
finite-frequency inversion of the background velocity in an idealized layered seismic setup. What we 
originally thought should have been a simple test case has now revealed itself to be a pathological 
example that cannot be solved. We hope it is useful to record this observation for the benefit of 
the community. We present a numerical example to this effect in the last section of this paper. 

Although all our arguments assume a layered model v{^z\ it is clear that they have an equivalent 
formulation in the radially symmetric case v{r^ via the so-called Earth flattening transformation. 
This is the original setting for the Herglotz-Wiechert formula. 

It is a nice coincidence that some of the mathematics reviewed or used in the proofs originates 
from the first half of the 20th century, and should be credited to such first-rate analysts as Herglotz, 
Bateman, Hardy, Littlewood, and Szego. 



1.2 Kinematics 

In this section we review the solutions of the Hamiltonian system of geometrical optics in a layered 
medium. This classical material is covered in many places, including at least [39} [28} [21 [3]. Let 
X = (x, z) for position and p = {px,Pz) for slowness; then 

x(t) = c(x(i))P('^ 



IpWI' 

p(t) = -Vc(x(t))|p(t)|. 
Since c only depends on z, it follows that horizontal slowness is conserved and equals 

cos ^0 

Px=P= , 

Co 

where is the take-off angle that the ray leaving from the origin makes with the surface z = 0, 
and Co is the wave speed there. We now slightly abuse notations and write c{z) for the wave speed. 
The rest of the system can be solved by writing 

^ \p{t)\'=p'+pl{t), 



c^{z{t)) 

isolating Pz{t) = \/l/c^{z(t)) — p^, and using this expression in the equation for z(t) to obtain 

m = c{z{t)ui-p^c^{z{t)). 

Solving this ODE by separation of variables gives the expression of the traveltime r as a function 
of z and p: 

T{z,p)=j — -dz, v{z,p) = c{z)y^l -p'^c^{z). (1) 
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The handy notation v{z,p) refers to the vertical velocity. Returning to the equation for x{t), we 
get 

hence the horizontal position of the ray as a function of z and p is 

The formulas ([T]) and ^ can be used as is for transmitted rays, by letting z = h in the upper 
bound of the integrals. Reflected rays obey the same expressions with a leading factor of 2. For 
short, we write t{p) and x{p) when z = h. 

The traveltime and arrival position of a diving (refracted) ray, however, are obtained by following 
the ray until it reaches a turning point and then returns to the surface 2; = 0. A ray will turn if 
v{z,p) = 0, i.e., if it reaches the first z = Z{p) where 

c{Z{p)) = 1/p. 

Then, for diving rays, 

t{p) = 2 / dz, x{p) = 2 / ^^-^ dz. (3) 

Jo v{z,p) Jo v{z,p) 

Data normally come in the form of one or more functions Ti{x) of the transverse position x, 
but let us now explain how to introduce p in this picture. Regardless of whether the ray is diving 
or reflected, the take-off angle is by symmetry the same as the angle that the ray makes with 
the surface z = when recorded there. Hence p = cos6/c at the arrival point as well. It follows 
that p is the rate of change of traveltime as a function of x: 



Pi 



T'i 



where i indexes the branch of the possibly multivalued traveltime. By inverting this relation we 
get the (unique) function x{p). In turn, we get t{p) = Ti{x{p)). The step of forming x{p) from 
its inverse function(s) may be numerically complicated, but does not in principle suffer from ill- 
conditioning. Hence in this paper we assume that r(p) and x{p) are given. 

It is a unique feature of layered media that the knowledge of Ti{x) implies that of the full 
scattering relation, i.e., of the take-off slowness vector in addition to the traveltime of each ray. 

1.3 Diving rays and the slowness distribution function 

The next (classical) step in solving the inverse problem is to change variables in ([s]). As long as 
c{z) is an increasing function of 2;, the relation c{Z{q)) = 1/q introduced in the previous section 
defines the unique inverse function Z{q). We then consider the Jacobian of z with respect to q^: 



dz _ \Z'{q)\ 
dq^ 2q 
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For diving rays, the two relations in equation ([3| become 

/■po I rpo I 
T{p) = 2 / {q^F{q)) dq, x{p) = 2 / {pqF{q)) dq. (4) 

The upper bound is po = l/co; which is indeed greater than p when c{z) is increasing. We see 
that t{p) and x{p) in principle carry the same information; from the mathematical prospective it 
is sufficient to focus on r(p) to determine 

It was the contribution of Herglotz [18] , Wiechert and Geiger [38j , and Bateman [3] to recognize 
that either of these Volterra integral relations can be reduced to the Abel transform, hence can be 
inverted in an explicit manner. In terms of t(p), for instance, 

1 1 r/'T 

nq) = — -J r-, — -, -r(p)dp- (5) 

T^q Jq ^Jp^ - g2 dp 



Putting numerical considerations aside, the solution of the inverse problem is now clear: 1) 
determine F{q) from equation (jsj), and 2) find c~^{z) as the inverse functionof = f^'' 2qF{q)dq. 

Although it is not the main topic of this paper, we detail for completeness the stability properties 
of the Abel inversion formula ([S]) in Section [2| In a nutshell, the Abel transform is a half-integral, 
hence its inverse is a half-derivative. As a result, the inverse Abel transform is very mildly ill- 
posed. It is bounded between Lipschitz spaces with orders differing by 1/2, and its singular values 
correspondingly decrease like n~^/^. 

The reasoning leading to an integral over slowness q can be extended to non-monotonous c{z) 
if we understand that F{q) is now the slowness distribution function (SDF), 

rZ{p) / I \ 

This formula should convey the idea of a "continuous histogram". The integrated or cumulative 
version of F is sometimes called the "layered cake" representation by analysts. If F(-;p) is too 
singular, the integral over q should be understood in the sense of Stieltjes. Note that the dependence 
of -F on p did not all of a sudden become crucial; the upper bound can be increased without 
consequence in the zone where c~^{z) < p. What matters is that the integral avoids large z > Z{p) 
for which c~^{z) increases back to p and beyond; those values of c^^ should not be counted in the 
SDF. 

The expressions Q still hold with F{q;p) in place of F{q) in more general situations when c{z) 
may not be monotonically increasing. Since r(p) and x{p) only determine F, the main obstruction 
to solving for c{z) is clear: any two profiles c{z) which have the same SDF F{q;p) will give rise to 
the same data t(p), x{p). In other words, if c{z) is the solution of the inverse problem, so will any 
(smooth) rearrangement within the interval [0,Z(p)]. This rearrangement ambiguity does not pose 
a problem when c{z) is increasing, but does in case c{z) decreases before increasing back to 1/p at 
depth Z{p). This explains the remark on "low- velocity zones" in Section [l.l[ 

^It would be foolish in practice to ignore the position x[p). If F{q) is determined from r(p) alone, the deviation 
of x{p) from its integral expression above is at least an important indication of how well the assumption of layered 
medium is satisfied. 
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In addition to this rearrangement ambiguity, let us keep in mind that Hmited angular coverage 
is another major reason why practical traveltime tomography falls outside the scope of the Abel 
inversion formula. In fact, it is plausible that the techniques developed in the next sections would 
also help quantify the extent of the ill-posedness of the limited-data diving ray tomography problem. 



1.4 Reflected rays 

For reflected rays, the situation is very different since we are back to formulas ([T]) and ^ with h 
in place of z in the upper bound of the integrals. As a result, the (/-integrals for reflected rays, in 
terms of the SDF introduced in the previous section, take the form 

t{p) = 2 r J— iq'F{q; p)) dq, x{p) = 2 T J— {pqF{q; p)) dq. (7) 

Jp -p Jp yq -p 

The integral bounds depend only on the medium properties; they are 

1 _ 1 

p = min —r-r-, P = max —r^- 
- 2e[o,/i.] c[z) 2e[o,/i] c{z) 

The bounds and p* on the horizontal slowness variable p, on the other hand, relate to the 
availability of data. They are the slownesses for which the traveltime information of the reflected 
rays is known in the interval [x{p^),x{p*)]. The relations between the various remarkable slownesses 
are summarized in Figure [T| 







PqP 



Figure 1: Relative sizes of the remarkable slownesses, in the case of reflected rays. The interval 
[p,p] is that of physical slownesses, i.e., l/c{z) for some z. In particular, pQ = l/c(0) belongs in 
this interval. The interval is that of observed horizontal slownesses, i.e., quantities of the 

form pq cos where 9 is the angle that the incident ray makes with the surface z = 0. Note that in 
the case of diving rays, the two intervals would overlap and p* > p. 



The lower bound p is now fixed, so we are dealing with first-kind Fredholm equations instead of 
Volterra equations for F {q; p) . This results in severe ill-conditioning of the inverse map in the case 
of reflected rays. 

The conditioning of the linear map in ^ depends on how p* relates to p. This information is 
captured by the smallest angle ip that reflected rays make with the lines z = const., whose cosine 
is 

/ P* 
cosip = — . 

p 

The larger this angle, the worse the conditioning of ([T]), i.e., the more unstable the inverse maps. 
But the problem is already ill-conditioned even if i/j = 0. 



6 



We will need the following functions of p,p,p^, and p*: the contrast e = {p/p)"^, and 

1/2 



p* = 1 + 2 



p* = 1 + 2 



1 - {p*IpY 

e - 1 

1 - {pVp? 

e - 1 



1 + 2 



1 + 2 



1 - {p*/py 

e-l 



1 



1 - iP*/p ? 
e- 1 



and 



P* - P* 



p* + + 2 
P* - P* 



1/2 



1 



1/2 



It turns out that both a and /3 are increasing functions of ip (when p^ is fixed) . Notice that a = 1 , 
or /9* = 1, if and only \i ip = {). We will see below that a gives rise to a lower bound ~ on the 
condition number, while (3 yields an upper bound ~ with N yet to be defined. These formulas 
may look complicated, but there are numerical indications that the upper bound generated by (3 is 
tight. 

In the sequel we limit ourselves without loss of generality to the equation for t{p). We let A 
for the linear map from F to r in ([7|. 

The notion of condition number is meaningful only if the number of degrees of freedom is 
limited. Most sampling schemes of interest would discretize the operator ^ as a M-by-A'^ matrix. 
Specifically, let 

• Vm be an orthogonal projector of rank M on L'^{p^:,p*); and 

• Qn be an orthogonal projector of rank N on L?'{p,p). 

It is natural to consider Amn = 'PmAQm, and its condition number 

{Amn) 



k,{Amn) 



\A 



MN\\2 PMArl|2 



{Amn) ' 

where + denotes the pseudo- inverse, and a denote the singular values. 

Definition 1. A discretization is any couple {Vm, Qn) of orthogonal projectors. It is called rea- 
sonable for A if 

WVMAQNh > lUh- 



Theorem 1. Assume p-t > 0. Let k.^ be the condition number of VmAQm ■ 

(i) For all discretizations {Vm, Qn), reasonable for A, and such that Qj^ has rank N , 

KN> CaNa^. 

It also holds that (useful when a = 1) 



(8) 



(9) 
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(ii) There exists a discretization ("Pat, Qn), where both Vn <ind Qn have rank N , such that 

<CpN^'^ . (10) 

Furthermore, Qn can be taken as the orthogonal projector on polynomials of p^ , of degree 
N -I. 

We do not believe that the prefactors A^, N~^^'^ and N^^^ are sharp in any way. We give 
numerical indications in Section [I that the rate of ^ may be sharp, while the constants in te^ 



and ^ are not. That ([9]) should scale root-exponentially when a = 1 seems adequate, however. 

It should be mentioned that the positivity constraint on F{p) or F(p; q) may be an important 
piece of information for solving the inverse problem in practice, but that it is does not call the 
linear stability analysis into question. As soon as F{p) is strictly bounded away from zero on its 
support, no small perturbation can compromise positivity, hence the linearized perturbative theory 
applies. 

Note that once F{q;p) is found, there may still not be a unique c{z) that corresponds to it. The 
ambiguity of smooth rearrangements in the "low velocity zones" is as much an issue for reflected 
rays as it is for diving rays. While the source of ill-conditioning is now twofold, we present a 
numerical example in Section [5] where the ill-conditioning detailed in Theorem [T] is in fact more 
problematic than the strict lack of uniqueness arising from the rearrangement ambiguity. 

1.5 Small p asymptotics 

We know from the previous section that there is a wide range of kinematically near-equivalent 
velocity profiles if one only considers data from reflected rays. It is possible to describe this range 
of velocities quite well in the case of small p, i.e., small offset between source and receiver. 

When p = (rays perpendicular to the layering), the problem is one-dimensional and the data 
reduce to the single number r = jj^ c^^(z) dz. Any two smooth velocity profiles ci, C2 such that 

/•h j-h 

1 c'^^{z)dz= 1 C2^{z)dz 
Jo Jo 

will appear indistinguishable. 



The extension of this observation to the case of small p > is that traveltimes will be nearly 
equal provided the integrals of the odd powers of c are identical, namely 



ph rh 

/ cf'-^iz) dz= cf'^^iz) dz, for < n < d 
Jo Jo 

for some (small) integer d > 0. Then the traveltimes for p £ [0,p*] will match up to a remarkably 
small 0{{p*) ). A justification is given in Section Wl The linearized version of the conditions 
above was found by Ivansson jl9j. See also the paper [T] by Bube for a more extensive study of the 
slowness nullspace in the linearized regime. 
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2 Theory for diving rays 



In this section, we briefly review the Herglotz inversion formula ([s]) and what is known about its 
stability properties. 



It is convenient to change variables as x = p'^, y = and express Q via an operator A as 

1 



Af{x) 



f{y) dy, 



X > 0, 



(11) 



with g{x) = T{y/x) and f{y) = yF{^). This relationship between / and g is (a version of) the 
Abel transform, or Abel equation [26]. It is also called a Riemann-Liouville integral. 



The key to the inversion formula for A is to notice that it is an operator of fractional differen- 
tiation of order 1/2 on the half-line. Indeed, 



with 



k{x,y) 



Kx,y)f{y) dy, 



dz, 



\J{z- x){y - z) 

X + y — 2z 



arctan 



TT. 



As a result, A 



2 _d_ 
dx 



-TT, hence 



A' 



2^{z-x){y-z) 



TT dx 



The inversion formula ([s]) follows. 
One can also quickly notice that 

A^ e"^ = 

from which one can (correctly) infer that 

Ae'"" = 



TT 



7r\i/2 

7J ' 



Re(s) < 0, 



Re(s) < 0. 



(12) 



This leads to the well-known fact that A is diagonal in the Laplace domain, that its powers form 
a semi- group, and that A^^ can also be computed via the scaling y^—s/ir in the Laplace domain 



[26]. This procedure is not advisable numerically due to the ill-conditioning of the inverse Laplace 
transform. 



Equation (12) also carries the information that any z such that Re(z) > is an eigenvalue of 
A, with square-integrable eigenfunction. The spectral theory of nonnormal operators such as A 
is however quite pathological, so this observation is rather useless. A natural finite dimensional 
approximation of would be a highly defective upper-triangular matrix with constant entries on 
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and above the diagonal. Eigenvalues are not the right tool to study stability under inversion for 
such nonnormal operators or matrices. 



Singular values, however, are perfectly informative for stability. The following result gives an 
explicit singular value expansion of A in the illustrative case when A acts on functions supported 
in X G [—1, 1] - otherwise some rescaling needs to be done. It should be credited to Johnstone and 
Silverman who proved a very similar result in [20] . 

Theorem 2. ( Johnstone- Silverman) Assume f & C{[— 1,1]) . Then 

^f{x) = ^ Un{x)an{Vn, f)r, (13) 



with 



Unix) = (1 — x^)^''^ Un{x), (Un are Chebyshev polynomials of the second kind,) 
Vn{x) = i — - — j P,^'^\x), (Pji''^^ are Jacobi polynomials,) 



On 



I 1 \ -1/2 

n + 1 ^ ' 



2 

and the inner product is 



-1 

{f,9)r= / J{x)g{x){l + x)dx. 



Remark 2.1. Notice that the Vn are an orthonormal basis for the "right" inner product {■ 
whereas Un are an orthonormal basis for the "left" inner product 

1 2/l + x^l/' 



= y ^/(a;)5'(2;);^ 1^^— dx. 

The particular values of the cr„ depend on the normalization of the inner products, but their decay 
rate ~ n~^l'^ does not. 



Proof. One first establishes that Avn = CnUn, from which (13) follows by orthonormality and 
completeness of the Vn ■ The proof is a matter-of-fact induction argument which combines equation 
22.13.11 in with the relations (2n + 1)P^°'^^ = (n + 1)P„ + nPn-i; r„ = C/„ - xUn-i; and 
Un+i = 2xUn — Un-i- Johnstone and Silverman claim that there is a less artificial way of obtaining 
relations such as Avn = cTnUn via hypergeometric functions. □ 



Since the singular values an decay like n"^/^, so will the singular values of any good discretization 
of A. As a result, we can expect that a matrix discretizing ^ on points would have a 0{^/N) 
condition number. This qualifies as very mild ill-posedness. 

We may also understand the stability properties of A^^ through boundedness estimates in 
adequate functional spaces. Consider the Lipschitz space Lip (a) of functions with bounded a 
semi-norm [13] 

^ f sup,^, ifO<a<l; 
1 ll/(L«J)||^_L„j ifa>l. 
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Hardy and Littlewood studied boundedness of fractional integration on Lipschitz spaces |17j . 
Their conclusion for A, properly modernized, is that for all / G Lip(a), a > 0, 

P/IU+i/2 < cii/iu, p-VlU-1/2 < cii/iu. 

This result again showcases the mild ill-posedness of inverting A. 

3 Theory for reflected rays 

In this section we prove Theorem [!} As in the previous section we change variables as x = , 
y = g^, to obtain 

I'm ^ 

f{x)=l , g{y)dy, xe[x^,x*], (14) 

Jx vy-x 

with 

f{x) = T{^/x), g{y) =yF(^]^/x), [x,x\ = \p^,p\ [x^,x*] = [pl,p*\ 

The relationship between the bounds is the same as earlier, namely x^ < x* < x < x. Since all 
X and y are bounded away from zero (from the assumption j?* > 0), considering the linear map 
between / and g — rather than that between r and F — changes the condition number by a factor 



independent of M, N. Hence it suffices to prove the claims of the theorem for (14). We overload 



notations and reuse the letter A for / = Ag as defined by (14). Note that this equation is quite 



different from (11). 



The singular values of A are the square roots of the eigenvalues of A* A, 

A*Af(y')= r k{y',y)f{y)dy, k{y' ,y) = f \, --dx. 



, ^{y' - x){y - x) 

The kernel integrates to k{y',y) = — 21og(2(-v/7/^^+-^/y^^))|!^*, which is clearly Hilbert-Schmidt 
on [x, x]^ even in the case when x = x*. Hence A* A is a, compact operator. As a consequence of 
the general theory, it has a discrete set of eigenvalues (with square-integrable eigenvectors) which 
can only accumulate at the origin. 

3.1 Legendre expansion of the kernel 

A key to understanding the spectrum of A* A is that l/\/y — x has an explicit expansion in terms 
of the Legendre polynomials rescaled to the interval [x, x] . Consider the new variables 

x + x x — x T, — X S — y 

^ = 1 A = , X = — — , y = — 7 — . 

2 2 A A 

If Pn{y) denotes Legendre polynomial of degree n with y G [—1, 1], then 



/n+ 1/2 

Pn{y) = \l ^ Pniy), n>0 



is an orthonormal basis for [x, x] with measure dy 
The desired expansion is 

1 / 2 
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where p < 1 is related to 5: > 1 through 



X = ^ p = X 



_ 1_ (16) 



Equation (15) is a straightforward consequence of the fact that 1/y 1 — 2yp + p^ is the generating 



function of the Legendre polynomials Pn{y)- Equation (16) is part a change of variables to and 
from elliptical coordinates in the complex plane; p~^ is the elliptical radius of the Bernstein ellipse 
passing through x > 1, with foci at ±1. Ultimately, it is well-known that the speed of convergence 



of a series like (15), or of the corresponding Chebyshev series, depends on the distance of the 



singularity at x to the interval [—1, 1] in the complex plane. 



The rescaled polynomials Pn{y) provide a unitary change of basis for A* A. Using equation (15), 
it suffices to find the eigenvalues of the semi-infinite matrix 

j-X j-X 

Km iy')kiy'^y)Pn{y) dy'dy', 



2A ( (n + ^)(m+ ^) ) / S: - \/ x^ - 1 



-1/ 



X _ 



m+n+1 



dx, m, n > 0. 



Further passing to the p variable via (16), it follows that Km,n is (up to the normalization factor) 
a Hankel matrix of moments: 

Km^n = A ((n + ^)(m + ^)) ^ p^^^dp^p), (17) 

with density 

^{p) = P'^ - P, P^ [p*,P*], 



where the bounds and p* relate to x* and respectively through (16); in particular p* 
X* - \/xl - 1. Note that < p* < 1. 

We now address the decay of the eigenvalues of K and of its finite-dimensional sections. 
3.2 Coarse lower bound (Isl) on the condition number 



In this section we start by assuming p* < 1 <^=^ ^ > 0, i.e., the rays never become horizontal and 
the kernel k{y,y') is bounded. 

The following two elementary lemmas detail how to deal with finite-dimensional projections 
of compact operators. Their proofs are nice homework exercises involving the Courant-Fischer 
min-max principle. In what follows eigenvalues are sorted in decreasing order, and projectors are 
considered on the domains over which they make sense. 

Lemma 1. Let Vm o-nd he two orthogonal projectors. Then for all j > \, 

HQnA*VmAQn) < Xj{A*A). 
Lemma 2. Let TZn be an orthogonal projector of rank N . Then 

Xn{A*A) < Ai((/ - nN)A*A{I - TZn)). 
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Given our N-hy-N matrix Aj\/^jv = 'PmAQn, where Vm and Qjy are arbitrary, we use the two 
lemmas above to obtain the bound 



Xn{AIjj^Am,n) < -'Rn)A*A[I -IZn)). 



We choose IZn to be the orthogonal projector onto polynomials of degree A^ — 1 in [—1,1], i.e. 
(/ — TZis[)A* A{I — IZm) is unitarily equivalent to the semi-infinite section m,n > N of the semi- 



infinite matrix K in equation (17). 



The largest eigenvalue of this semi-infinite section is further bounded by the Hilbert-Schmidt 
(Frobenius) norm. 



Al((/-7^^)^M(/-7^;v)) < 



m,n>N 



2 



1/2 



By elementary maj orations involving geometric series, there exists C > such that the quantity 
above is less than 



i-{p*Y' 

For yO* < 1, it follows that the iVth singular value of Am,n obeys (C is a number that changes from 



line to line) 



UN {A 



MM, 



<Cy/AN-^ 



*\N 



As for the first singular value, we use the assumption that the discretization is reasonable for 
A (Definition [l| to obtain ai{AM,N) > C a/A > where C is independent of A^. We assemble 
inequality ([s]) by considering that 



k{A 



M,N) 



o'i{Am,n) 
o'n{Am,n) 



and noticing that p* = 1/a. 



So far we have assumed a < 1 <=^ ip > 0, but it is clear that the result is also true (and 
somewhat uninformative) when a = 1. The justification of this fact is a very special case of the 
analysis in the coming section. 



3.3 Fine lower bound rtol) on the condition number 



In this section we consider the worst case scenario when p* = p, or equivalently /?* = 1, or ^ = 0. 
A fortiori the bounds we derive here also hold for any < p* < 1. 

The proof idea for (9]) is that the interval [x,x] can be subdivided into subintervals of the form 
Ij = [x,x] n [(1 -|- 5^~^ )x, (1 + 6^)x], with j = J, J + 1, . . . and for some 5 < 1. Here J is the 
largest integer such that (1 + 5'^)x > x. The operator A correspondingly splits into the sequence of 
operators Ajf = Axi^f ■ In accordance with the notation for matrix multiplication we suggestively 
write A = (^j,...,^j,...). 



13 



The coarse bound ([8j) can now be applied to each Aj. The same reasoning as in the previous 
section apphes, yielding 



6^ + 5^+^\ , [53-5^+^ 
-J ' = a 

5j — X 1 + 6 



= X ( 1 + j , = X 



1 - V6 



Hence the sequence of singular values of Aj obeys 

T-l ^1^3 > 



J 

N 

/I . / X \ 

< D {V6) 



i-VT 

where D is some re-usable constant which depends on 6 and x, but not j and N . For short we let 

The recombination of these various sequences, indexed by j, is heuristically done by concate- 
nation. The precise statement is the following inequality due to Weyl. 

Lemma 3. (Weyl) Consider partitioning a compact operator A as {B,C). The singular values of 
A, B and C are related by 

ah,+i{A) < al,{B) + a]^,{C), i,j > 0. 

Proof. Write AA'^ = BB^ + CC'^ . Apply Weyl's inequality to this sum of Hermitian compact 
operators [Ml E] : 

The eigenvalues of AA^ , BB^ , CC^ are the squares of the singular values of ^, B, C respectively. 

□ 



Let us apply this inequality recursively. Let K > J {to be determined) and Uj be integers such 
that XljLj = ^ ■ Then 

K oo 
j=J j=K+l 

The last term is seen to be 

oo 

al{Aj)<D6^. 

j=K+l 
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The numbers rij are chosen so that each term a^._^_i{Aj) < d^rf'^^ is also on the order of 5^ 
(up to a multiphcative constant that depends on 5, x, but neither j nor K.) For this purpose it is 
sufficient to take 

rounded off to the nearest smaller integer. 

The sequence (nj, . . . , nx) sums up to a number less than or equal to N provided 



Choosing K the largest integer smaller than J — 1 + log^ 6 will do. As a result 



a%{A) <DK6'^ <D^/N (^^^/^i^^ 
Here and earlier, the proportionality constant D depends on 5 and x, but not N . 



We now address the choice of < 5 < 1. The number put to the power V-^ above has for 
logarithm 

log (^^2/^1^^ = 2Vlog77 log 5, 

with rj = The sharpest bound is obtained when log 77 log 5 is minimized as a function of 

6 G [0, 1]. Numerically, this happens when 

<5 = 0.1716... 

In that case 

aN{A) < DN'/^ (^i/^i^)^, = 0.2875... = e-i-^^es- 

The largest singular value is lower-bounded away from zero for the same reason as in the previous 
section. The root-exponential bound on the condition number follows. 



3.4 Upper bound (10) on the condition number 



In this section we seek an upper bound on Xi{A*jy,jj^AMN), and a lower bound on \j\f{A*j^jj^AMN), 
for some particular choice of Vm and Qjv- The bound on Ai is easy: use Lemma[T]for Vm = I and 
j = 1 to obtain a bound C A, independent of N. 

Qat is chosen as the orthogonal projector in Lp'{x,x) on the (rescaled Legendre) polynomials 
of degree — 1. For the definition of the orthogonal projector it makes no difference whether 
those polynomials are orthogonalized or not. Since x = p^, Qiv is as described in the wording of 
Theorem [1} The resulting matrix QnA*'PmAQn is analogous to a finite section of AT, except for 
the presence of Vm'- 

( 1 1 



2'' 2 



I m+1/2 I 

y — \ Vm X — \/ x'^ — 1 



n+1/2 



dx, < m,n < N — 1. 
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This expression reduces to Km,n in (17) by choosing M = N, and Vm the orthogonal projector in 
L^(x*,x*), on the subspace 



span{ X — \/ — 1 



n+l/2 



S.t. X 



X — T, 



< n < - 1}. 



We are thus left with the problem of finding a lower bound on the smallest eigenvalue of each 
finite section 0<m,n<A^ — lof the nearly-Hankel matrix Km,n in ( 17 ) . This question was settled 
in the Hankel case by Szego in 1936 |_35 j , where the full asymptotic behavior as — )• oo was studied. 
Widom and Wilf, unaware of Szego's result, rediscovered it in 1966 with the same techniques [37j . 

Since our matrix K is not exactly of Hankel type (because of the factor ((n + ^){m + j)) ^^^), we 
rehearse and adapt their argument. 

We start with a beautiful characterization of the inverse of a moment matrix which, according 
to Berg and Szwarc [5J, was first discovered by Aitken [10]. Let 



Hm,n = J p"'+"dfi{p), 0<m,n<N-l 



be a Hankel matrix of moments of the positive measure fi{p). Let Ln{x) denote the orthogonal 
polynomials associated with p{x). Then is similar to the matrix G with entries 

G.m,n = 7^ r L.m{e'')Ll{e'') d9, 0<m,n<N-l. 
Jo 

The main observation of Szego, and Widom and Wilf, is that the large n asymptotics of the 
polynomials Ln{z) on the unit circle translates into the large {m,n) asymptotics for Gm,n- 

Lemma 4. (Szego-Widom-Wilf) Assume that supp p is a finite interval [p*,p*] C M+, and that p 
does not vanish on its suppor^ Then 

Gm,n = l{m + nY^lT^'^ + o((m + nY^lT^''), 

for some 7 > 0, and where 

1/2 



P - P* 



p* -p* 



(18) 



The next step is to approximate the eigenvector corresponding to the leading eigenvalue of G. 
This is where we depart from \35\ I37j. Our matrix of interest is not H^^ but 

{K-^)m,n similar to ^(m + ^)^/^Gm,n{'>^ + ^)^^^- 
Consider approximating the finite section 0<m, n<A^ — las 

^(m + ^)^/^Gm,n(n + ^)^/^ = Lm,n + Rm,n, 



^ Szego requires the weaker condition 

log 



/2 



,dx<oo. 
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where L„i,n is the leading rank-1 expression 

Lm,n = ^(2^^ - 2y^^^V„,Vn, Vn = {n + ]^^l''ir. 

It is easy to show that the spectral radius of the remainder Rm,n tends to zero as — t- cxd, so it 
can be neglected in an asymptotic sense for large N . On the other hand, Vn is the eigenvector that 
corresponds to the unique nonzero eigenvalue of the leading part L^.n- This eigenvalue obeys 

JV-l nrr r,2N 



A,(L) = ^(2Ar-2)-'/'^X:K,/<C^J 



n=0 ^ 

By taking the constant sufficiently large, this bound also holds for L + R, for all N. Specializing 
to Kjn,n, we get 



with /3 given in (18). The result on the condition number follows. 



3.5 Szego average decay of the eigenvalues. 

New ideas may be required to sharpen the constants in the lower bounds on the condition number. 
One useful piece of information could be the rate at which the determinant of a Hankel form grows 
as — oo. 



Consider = det H^^~^^\ where 

Hj^,;t'^ = r 0<m,n<N. 



Szegcj^ found the asymptotic expression 

lim = 27rexp r log fi' {h{e))de) , 



where h{6) is a map from [0,27r) to [/0*,p*], and c is called the transfinite diameter of [p*,p*] 
corresponding to this map. It is possible to choose h as a simple trigonometric function such that 

P* -P* 



4 

If we let R for the right-hand side, we obtain the explicit asymptotic formula 

Since is the product of the + 1 eigenvalues of H^^^^\ and if we postulate that these 
eigenvalues decay geometrically, then the only possible decay rate is (up to a polynomial factor) 



c" with c = ^ . ^* . The same decay rate would hold for the eigenvalues of the matrix K in (17). 



^Explained on p. 85 of [T5]. Szego is best known for proving the corresponding result for Toeplitz forms (in which 
case c — 1) when he was an undergraduate student, after Polya posed it as a conjecture. This note on the historical 
context of the Szego distribution theorem is taken from [30j . 
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The corresponding asymptotics for k jv for the finite-section discretization of the operator A would 
fohow as Kiv ~ c'^l"^ up to a polynomial factor. 

As we have no indication that the eigenvalues of sections of H or K indeed decay geometrically, 
or whether the "average" rate c~^/^ could be useful in any way toward formulating a bound on 
kn, we contend ourselves with reporting it numerically with the other bounds in Section [5] 



4 Small p asymptotics 

The obstruction to the traveltime tomography problem in the case p = (rays perpendicular to 
the layering) was covered in Section 1.5 In the more general case when < p < p* with small p* , 
equation ([T]) can be rewritten at z = /i as 

T{h,p) = / = dz. 

Jo c{z)^J\ - p^(?{z) 

Perform a binomial expansion of the inverse square root to find its Taylor expansion as 

1-1/2 ^ V" /^"^/^ 

^ n 

n=0 ^ 

The first few generalized binomial coefficients are 



, etc. 



In our case, lip is small enough |2;| = p^(?[^z) < 1. The smaller p the more accurate the truncation 
of the sum to the first few terms: 

T{h,p) = (-l)V" f ^''-\z)dz + 0{{copf''). 

(We placed the ad-hoc factor cq = c{z = 0) in the remainder to make it dimensionless. Recall that 
cqp = cos 9 where 9 is the angle that the ray labeled p makes with the surface z = 0.) It is now 
clear that if two profiles ci and C2 have matching "odd moments" up to degree d — 1, i.e. 

^cf-\z)dz= [^cf'-\z)dz, 0<n<d, 
Jo 

their responses Ti{h,p) and T2{h,p) will match to within O((cop)^'^). The name "moment" owes from 
the fact that these integrals are precisely moments of the slowness distribution function introduced 
in equation ([6|), namely 

c"\z)dz= / q'^Fiq-p) dq. 
J 

The moment-matching inverse problem is notoriously ill-posed [29]. 

The reasoning carries over without difficulty to the case of slightly differing odd moments. For 
instance, in order to get 

\Ti{h,p) -T2{Kp)\<e, ioi Q<p<p*, 
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it suffices to find the smallest d such that (cop*)^'^ < e, and require 



z) dz 



2n-l 



{z) dz 



-2n 



< e 



-1/2 

n 



for < n < d. 



This latter relation defines a rather elongated set around ci , a "ambiguity region" of kinemati- 
cally near-equivalent velocity profiles C2- In general there will exist such near-equivalent ci and C2 
for which the difference C2 — ci is non-oscillatory, i.e., contain only low wavenumbers. A numerical 
illustration of this phenomenon is shown in Section [5] As a result Tychonov regularization will 
hardly be able to discriminate between ci and C2 if they are comparably smooth: this is bad news 
for the prospect of solving the inverse problem. 



5 Numerics 



In Figure [2] we show an illustration of the various bounds on the condition number Kjy as a function 
of A^. For the particular choice of discretization made for the upper bound ( 10 ) in Theorem[l| recall 
that 



(TV) 



(TV)' 
TV 



where y^n^ is the nth eigenvalues of the size-A finite section {Km,n : < m,n < N} of the infinite 



matrix K in (17). The graphs of 1/a/ xi^^ are plotted on the same picture as a function of n, with 



the different curves indexed by A^. Other discretization choices may not be linked in any way to 



the A 



(TV) 
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5.1 Bounds on the condition number 




Figure 2: Various bounds on the condition number as a function of the discretization parameter N, 
for the reflected rays setup. Notice the logarithmic scale of the y axis. Left: = and p* = 0.5. 
Right: = 0.5 and p* = 1. Dotted curve: first lower bound ([s]). Dashed curve: second lower 

(see text), as 



AN) 



bound (9). Dash-dotted curve: upper bound (10). Solid curves: plots of l/\jXn 
a function of 1 < n < for varying N. Blue crosses: Szego average rate from Section |3.5 
further comments in the text. 



See 



The curves for the bounds were scaled by an arbitrary constant, which amounts to an arbitrary 



vertical translation in logarithmic scale. The first observation is that the upper bound (10) seems 



sharp as it scales like 1/y A^'' . 

The behavior of the eigenvalues An°°'' of the infinite matrix K is given by the lower envelope 
of the eigenvalue curves. Its scaling seems to be root-exponential in the case p* = I (Figure [2| 
right panel), i.e., of the form ce~'^^ for some numbers c, d > 0. The lower bound ^ indeed scales 
root-exponentially, albeit with a different non-sharp constant in the exponential. Note that the 

ratio Y^^Ap'^^^/A^r^ in the case N ^ oo can be seen as the discretization- free condition number, i.e., 
the condition number of the best discretization which gives rise to the largest singular values. In 
that case, the projector Q„ project onto the subspaces formed by the n eigenvectors corresponding 
to the largest eigenvalues a[°°\ . . . , Ai°°\ The discrepancy between A^^ (upper end of the curves) 

and aS^) (1 ower envelope at the same abscissa) shows that the discretization defined by taking 
finite sections is far from being "best" in the sense discussed above. 



5.2 Negative implications for imaging 

Although the theory in this paper concerns rays rather than waves, the conditioning issue identified 
here also plagues the finite-frequency, waveform-based inversion problem of reflection seismology. 
A finite difference acoustic wave simulation was carried out in a smoothly increasing medium c{z) 
shown as the blue dashed curve in Figure [3j to create synthetic seismograms of reflected waves 
(not shown). The receivers cover the surface z = 0. There is a single source at x = z = 0. Note 
the "reflector" near z ~ 1300 which is responsible for the wave echos recorded at the surface. 
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The wavelength of the probing waves is about 50 m. The data corresponding to diving waves are 
discarded. The initial c{z) is the black solid curve. 

The inverse problem of determining the background velocity c{z) from these synthetic seismo- 
grams was solved using (our own implementation of) the Mulder- Van Leeuwen correlation-focusing 
method [22]. In a nutshell, least-squares based inversion - minimizing the £2 norm of the waveform 
residual - would fail because of lack of convexity of the minimization objective, but correlation 
focusing is an alternative choice of objective that does not suffer (as much) from that problem. 

The inversion procedure converges successfully so that data are fit within a few digits of ac- 
curacy. Yet the converged speed profile (red dash-dotted curve) is significantly different from the 
original "true" speed profile. Various levels of Tychonov regularization did not help in improving 
convergence. 
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Figure 3: True vs. converged velocities. Velocity profiles are functions of z only. Different levels 
of Tychonov regularization do not noticeably improve the converged model. 



The original c{z) used in the forward modeling step was chosen to increase monotonically, 
so the ill-conditioning is not due to the rearrangement ambiguity (also called "presence of low- 
velocity zones".) Instead, it is a (spectacular) finite- frequency remnant of the conditioning problem 
associated with reflected rays as studied in this paper. 

Another numerical piece of evidence for the problem associated with reflected rays is Figure 
[4j The background velocity is shown in shades of yellow and red: it is the same "true" wave 
speed profile as earlier. In white, the rays of geometrical optics were traced in this "true" medium 
(blue dashed curve in Figure |3j). In black, we traced rays in the converged "optimal" medium from 
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correlation- focusing inversion (red dash-dotted curve in Figure [sj) Notice how the transmitted rays 
reach the reflector z ~ 1300 (and then reflect) at almost the same location as the white rays with 
the same take-off angle. The diving rays are completely different, on the other hand - hence they 
contain much more information than the reflected rays. 

X (m) 




Figure 4: White and black lines are rays in the true and converged background velocity shown in 
Figure [3j respectively. The number of lines and take-off angles are same in both cases. Yellow and 
red colors represent the magnitude of the true background velocity model. The authors of RSF 
and Madagascar are gratefully acknowledged for providing the plotting routines. 

Finally, we compare the odd moments of the converged velocity proflle cy^^z) from the correlation- 
focusing method after k iterations, to those of the "true" velocity profile c[z). The table below lists 
the quantity 

Jiciz))Pdz 

for different values of p and k. 



p\k 





1 


2 


3 


-1 


-1.66e-01 


2.51e-02 


6.20e-03 


6.62e-03 


1 


1.56e-01 


-2.72e-02 


-4.84e-03 


-5.32e-03 


3 


4.27e-01 


-8.43e-02 


-7.41e-03 


-9.00e-03 


5 


6.34e-01 


-1.38e-01 


3.25e-03 


4.30e-04 


7 


7.78e-01 


-1.82e-01 


2.88e-02 


2.47e-02 


9 


8.71e-01 


-2.12e-01 


6.80e-02 


6.28e-02 
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The moments match to within a few digits after very few iterations, as they should from the 
discussion in Section HI 

6 Discussion 

We have shown that the isotropic, laterally-homogeneous traveltime tomography inverse problem 
has well-posed formulations in the case of diving rays, but suffers from incurable ill-conditioning 
in the case of reflected rays. While diving rays involve a Volterra integral equation, reflected rays 
involve a Fredholm integral equation. Intuitively, a Fredholm operator is to a rank-deficient full 
matrix what a Volterra operator is to the upper-triangular restriction of such a matrix. 

Our analysis shows that well-posedness is linked to the presence of overturning rays, i.e., rays 
whose direction is at some point parallel to the level lines of the speed profile. We do not know 
if this non-transversality condition could play a role for the analysis of the more general case of a 
laterally varying c(x, z). 

The ill-conditioned nature of the reflection traveltime tomography problem has serious impli- 
cations for imaging, even at finite frequencies. The seismic inverse problem in a smooth, layered 
background c{z) with surface data can only be only well-posed if either (1) low-frequency data is 
seriously taken into account, and/or (2) the reflectors are more or less "dense" in the sense that 
the true wave speed profile is "rough everywhere". The latter point was made precise by Symes 
who wrote a remarkable stability estimate in |34j . 

Finally, it should be mentioned that if we restrict the domain to a rectangle, and avail ourselves 
of complete data on all the sides, then the problem of recovering the wave speed from traveltime 
data becomes much better posed. For instance, Mukhometov proved a stability estimate (with loss 
of one derivative) in the case of isotropic media that deviate little from a constant [23] . 
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